PDTB-style Discourse Annotation of Chinese Text

Authors

  • Yuping Zhou
  • Nianwen Xue
Abstract

We describe a discourse annotation scheme for Chinese and report on the preliminary results. Our scheme, inspired by the Penn Discourse TreeBank (PDTB), adopts the lexically grounded approach; at the same time, it makes adaptations based on the linguistic and statistical characteristics of Chinese text. Annotation results show that these adaptations work well in practice. Our scheme, taken together with other PDTB-style schemes (e.g. for English, Turkish, Hindi, and Czech), affords a broader perspective on how the generalized lexically grounded approach can flesh itself out in the context of cross-linguistic annotation of discourse relations.

Similar resources

Inferring Discourse Relations from PDTB-style Discourse Labels for Argumentative Revision Classification

Penn Discourse Treebank (PDTB)-style annotation focuses on labeling local discourse relations between text spans and typically ignores larger discourse contexts. In this paper we propose two approaches to infer discourse relations in a paragraph-level context from annotated PDTB labels. We investigate the utility of inferring such discourse information using the task of revision classification....

An Annotation System for Development of Chinese Discourse Corpus

Well-annotated discourse corpora facilitate discourse research. Unlike for English, a Chinese discourse corpus is not yet widely available. In this paper, we present a web-based annotation system to develop a Chinese discourse corpus with much finer annotation. We first review our previous corpora from the practical point of view, then propose a flexible annotation framework, and finally dem...

Annotation And Data Mining Of The Penn Discourse TreeBank

The Penn Discourse TreeBank (PDTB) is a new resource built on top of the Penn Wall Street Journal corpus, in which discourse connectives are annotated along with their arguments. Its use of standoff annotation allows integration with a stand-off version of the Penn TreeBank (syntactic structure) and PropBank (verbs and their arguments), which adds value for both linguistic discovery and discour...

Turkish Discourse Bank: Porting a discourse annotation style to a morphologically rich language

This paper describes the current state of the Turkish Discourse Bank, the first publicly available annotated discourse resource for Turkish. It describes the annotation methods and the challenges posed by annotating Turkish, a free word order language with rich morphology. It shows the usefulness of the PDTB style annotation but points out the need to expand this annotation style with the needs...

The Penn Discourse TreeBank 2.0

We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense annotation of the relations, and (c) the attri...

Publication date: 2012